Systematic review: YouTube recommendations and problematic content

There has been much concern that social media, in particular YouTube, may facilitate radicalisation and polarisation of online audiences. This systematic review aimed to determine whether the YouTube recommender system facilitates pathways to problematic content such as extremist or radicalising material. The review conducted a narrative synthesis of the papers in this area. It assessed the eligibility of 1,187 studies and excluded studies using the PRISMA process for systematic reviews, leaving a final sample of 23 studies. Overall, 14 studies implicated the YouTube recommender system in facilitating problematic content pathways, seven produced mixed results, and two did not implicate the recommender system. The review's findings indicate that the YouTube recommender system could lead users to problematic content. However, due to limited access and an incomplete understanding of the YouTube recommender system, the models built by researchers might not reflect the actual mechanisms underlying the YouTube recommender system and pathways to problematic content.

This study aimed to investigate whether algorithmic recommendations facilitate pathways to extreme content.
This study aimed to determine whether the recommender system facilitates pathways to extreme right content.
This study aimed to analyse racism from an Australian-based controversy on social media platforms.
2017 133
The implications of venturing down the rabbit hole. Kaiser, J., & Rauchfleisch, A. (2019).
Modelled the network's tendency to create ties by adding the video's sentiment towards vaccination as the model's primary independent variable. Anti-vaccine videos were the baseline.
Another independent variable was whether videos connected by a tie shared the same sentiment. Also checked whether videos reciprocally recommended each other.
Public health experts watched all vaccine-related videos and categorised them as pro-vaccine (supported immunisation), anti-vaccine (refusal attitude or rejection of vaccines) or neutral sentiment (news media presenting both sides of the "debate").
Social Network Analysis using Gephi (crawl depth = 2). Scraped the recommender system from seed videos and created a node and edge list in Gephi. Conducted a modularity analysis (quality of clustering) to determine communities of videos. Assigned eigenvector centrality to each node, a measure based on a node's connections to other nodes: the higher the value, the higher the node's importance to the network.
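As an illustration of the centrality step, the following is a minimal sketch of eigenvector centrality computed by power iteration. It is not Gephi's implementation; the function name, the toy graph, and the choice to treat recommendations as undirected ties are all assumptions made for the example.

```python
from math import sqrt

def eigenvector_centrality(edges, iterations=200, tol=1e-10):
    """Power iteration on an undirected graph given as (u, v) pairs."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    score = {node: 1.0 for node in neighbours}
    for _ in range(iterations):
        # Each node keeps its own score and adds its neighbours' scores;
        # keeping the own score stops the iteration from oscillating.
        new = {n: score[n] + sum(score[m] for m in neighbours[n])
               for n in neighbours}
        norm = sqrt(sum(value * value for value in new.values()))
        new = {n: value / norm for n, value in new.items()}
        if max(abs(new[n] - score[n]) for n in new) < tol:
            return new
        score = new
    return score

# Hypothetical toy graph: "hub" is tied to every other video.
toy_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("a", "b")]
centrality = eigenvector_centrality(toy_edges)
ranking = sorted(centrality, key=centrality.get, reverse=True)
```

In this toy graph the best-connected video ranks first and the video with a single tie ranks last, which matches the intuition that well-connected nodes matter more to the network.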

Radicalised Content
Criteria: Channel has over ten thousand subscribers. More than 30 percent of the content on the channel is political.

channels
Conducted a social network analysis and estimated the number of times a video was recommended to a user (impressions). Assessed the number of impressions a video was receiving.
Channels were tagged. The soft tags were: conspiracy, libertarian, anti-SJW, social justice, white identitarian, educational, late-night talk shows, partisan left, partisan right, anti-theist, religious conservative, socialist, revolutionary, provocateur, Men's Rights Activists, Missing Link Media, state-funded, anti-whiteness. Three labellers allocated tags to each channel.
If two or more labellers defined a channel by the same label, that label was assigned to the channel. An intraclass correlation coefficient was used to determine the agreement amongst labellers. Assembled thirteen aggregate groups that represented the political perspectives of the included channels.
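The two-of-three labelling rule described above can be sketched as a simple vote count. The function name and the example tags assigned to the channel are hypothetical; the reviewed study's actual tooling is not described in the source.

```python
from collections import Counter

def agreed_tags(labeller_tags, min_votes=2):
    """Return tags chosen by at least `min_votes` labellers, sorted by name."""
    # set(tags) ensures a labeller cannot vote twice for the same tag.
    votes = Counter(tag for tags in labeller_tags for tag in set(tags))
    return sorted(tag for tag, count in votes.items() if count >= min_votes)

# Hypothetical tags from three labellers for one channel.
channel_tags = [
    {"conspiracy", "partisan right"},
    {"conspiracy", "libertarian"},
    {"partisan right", "conspiracy"},
]
result = agreed_tags(channel_tags)
```

Here "libertarian" is dropped because only one labeller chose it, while the other two tags are kept.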

Extreme Right Content
Two data sets associated with extreme-right English language and German language Twitter accounts were generated by retrieving profile data over an extended period.
English channels: 26,460; German channels: 3,046.
1. Aggregation process to rank seed channels.
2. Generated TF-IDF channel document vectors and identified topics using NMF.
3. Categorised the classified topics concerning the set defined in
Top 10 recommended videos for incel-related videos and control sets collected from the YouTube API. The researchers built a directed graph with nodes (videos) and edges (recommendations). Measured the prevalence of incel-related videos in the network. The researchers calculated the out-degree in terms of incel-related and Other labelled nodes.
To measure whether YouTube facilitates incel-related communities, the researchers used a random walker (crawl depth = 5, repeated 1,000 times). Each walk started from either an incel-related video or an unrelated video. The random walker was used to analyse the likelihood of encountering incel content, measured as the percentage of incel-related video encounters on the walks.
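A random-walk audit of this kind can be sketched as follows. This is a minimal illustration under stated assumptions: the graph, labels, and function name are hypothetical, the walker picks uniformly among recommendations, and the measure shown is the share of walks that meet at least one target-labelled video, which is one possible reading of the percentage reported.

```python
import random

def share_of_walks_hitting(graph, labels, starts, target="incel",
                           depth=5, repeats=1000, seed=0):
    """Fraction of random walks that reach a video labelled `target`."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    hits = 0
    for _ in range(repeats):
        video = rng.choice(starts)
        encountered = False
        for _ in range(depth):
            recommendations = graph.get(video, [])
            if not recommendations:
                break  # dead end: no recommendations to follow
            video = rng.choice(recommendations)
            if labels.get(video) == target:
                encountered = True
                break
        hits += encountered
    return hits / repeats

# Hypothetical recommendation graph: video -> recommended videos.
toy_graph = {"v1": ["v2", "v3"], "v2": ["v1", "v3"], "v3": ["v4"], "v4": ["v3"]}
toy_labels = {"v4": "incel"}
share = share_of_walks_hitting(toy_graph, toy_labels, starts=["v1"])
```

Running the same function with starts drawn from target-labelled videos and from unrelated videos would give the two conditions compared in the text.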
They built a lexicon of words commonly used by the incel community (200 terms). Researchers identified incel-related terms and included them in the lexicon if they indicated hate or misogyny, or were directly associated with incel ideology. These terms were used to decide whether a video was incel-related. Annotators looked through video transcripts, titles, tags, and comments.
A video was classified as incel-related if it had one incel-related term in the video and three in the comments (based on F1 scores).
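The threshold rule above can be sketched as a lexicon count. Several details are assumptions: the thresholds are read as "at least one" and "at least three", lexicon entries are treated as single lowercase words, and the placeholder terms and function names are hypothetical.

```python
import re

def count_terms(text, lexicon):
    """Count how many words in `text` appear in the lexicon."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return sum(word in lexicon for word in words)

def is_incel_related(video_text, comments, lexicon,
                     min_video=1, min_comments=3):
    """Apply the video and comment thresholds (assumed to be minimums)."""
    video_hits = count_terms(video_text, lexicon)
    comment_hits = sum(count_terms(comment, lexicon) for comment in comments)
    return video_hits >= min_video and comment_hits >= min_comments

# Placeholder lexicon; the real 200-term lexicon is not reproduced here.
lexicon = {"alpha", "beta"}
flagged = is_incel_related("An alpha title", ["beta beta", "alpha here"], lexicon)
```

`video_text` would combine whatever video-level fields are searched, such as the title, tags, and transcript.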

Extremist Content
Two datasets: right-wing populist and politically neutral videos.
1,663 German political videos.
Social Network Analysis using a random walk algorithm. The random walk starts from one of the ten initial videos and randomly clicks on a recommended video. The random walker then selects another video from the recommender system. The sequence of nodes and their attributes is stored for each run. This process is repeated 5,000 times until the videos from the initial dataset have been passed. To measure video content homogeneity, the researchers used the E-I index (identifies the direct links from nodes to their recommendations based on the class). Values range from -1 to 1 (-1 = heterogeneous network, 1 = homogeneous network).
Social Network Analysis: assigned network centrality measures to each video (betweenness centrality, in-closeness centrality, out-closeness centrality, in-degree centrality, and out-degree centrality) to determine the extent to which a video influences the network. They then conducted a t-test to compare centrality between pro-vaccine and anti-vaccine videos.
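For the E-I index mentioned above, a minimal sketch is given below using the commonly cited formulation (E - I) / (E + I), where E counts links between classes and I counts links within a class. Under that formulation -1 means every link stays within a class, so the orientation reported above would correspond to the sign being flipped; which convention the reviewed study used is an assumption here, as are the function name and example data.

```python
def ei_index(edges, group):
    """E-I index: (external - internal) / (external + internal) links."""
    external = sum(1 for u, v in edges if group[u] != group[v])
    internal = len(edges) - external
    # Raises ZeroDivisionError for an empty edge list, where the index is undefined.
    return (external - internal) / (external + internal)

# Hypothetical classes and recommendation links.
group = {"p1": "populist", "p2": "populist", "n1": "neutral", "n2": "neutral"}
edges = [("p1", "p2"), ("p2", "p1"), ("n1", "n2"), ("p1", "n1")]
index = ei_index(edges, group)
```

With three within-class links and one between-class link, the index is negative, indicating that recommendations mostly stay inside a class.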
An author with a public health background watched 1,984 vaccine-related videos and categorised them by sentiment (pro-vaccine, anti-vaccine, and neutral).
Used a binary classifier to detect how many videos in the dataset were inappropriate or appropriate for children. Created a directed graph with nodes as videos and edges as recommendations between videos. Checked the number of transitions from appropriate to inappropriate videos and vice versa. Conducted random walks to determine the likelihood of reaching inappropriate content from appropriate content. Random walks were set for ten steps through the recommender system, and each video was classified along the way (repeated 100 times).
Created a ground truth dataset and then used a deep learning model to detect disturbing videos for children. Manual annotation process: Suitable: appropriate for children aged 1-5 and relevant to specific age group interest. Disturbing: sexually suggestive scenes and language, child abuse, and horror.
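Counting transitions between appropriate and inappropriate videos in a directed recommendation graph can be sketched as below. The labels, edges, and function name are hypothetical, and the classifier that would produce the labels is out of scope for this example.

```python
from collections import Counter

def transition_counts(edges, labels):
    """Count directed recommendation edges by (source label, target label)."""
    return Counter((labels[src], labels[dst]) for src, dst in edges)

# Hypothetical classifier output and recommendation edges.
labels = {"a": "appropriate", "b": "appropriate", "c": "inappropriate"}
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]
counts = transition_counts(edges, labels)
to_inappropriate = counts[("appropriate", "inappropriate")]
to_appropriate = counts[("inappropriate", "appropriate")]
```

Dividing each count by the number of edges leaving videos with the source label would turn these counts into transition rates.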

Pseudoscientific Content
Four topics: COVID-19, the anti-vaccine movement,, and the flat earth theory.
6.6K unique videos (1.1K seed videos and 5.5K videos recommended from seed videos).
Crowdsourced annotators (992): each video was presented to three annotators, who labelled it as science, pseudoscience, or ethics. Science: content relates to the systematic study of the natural world. Pseudoscience: rejects scientific consensus, unfalsifiable, ideas without grounds in scientific methods, explains events as secret plots by powerful forces rather than overt activities or accidents. The classifier then uses snippets, video tags, transcript, and the top 200 comments of a video to detect videos containing pseudoscientific and scientific content.
The simulation system's basis is a coordinate system modelled after the initial system's assumed latent space.
Utilised a 256-dimensional unit hypercube; videos were assigned coordinates by their "real" position, following their properties. An apparent position of each video was also recorded (the real position with noise). Collected pairs of videos that were seen one after the other by users and applied a weak force between them in the apparent space. They calculated the expected distance between two random points in a unit hypercube.
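The expected distance between two random points can be estimated by simulation. For independent uniform coordinates on [0, 1], the expected squared difference per dimension is 1/6, so the expected squared distance in n dimensions is n/6 and, by Jensen's inequality, the expected distance is at most sqrt(n/6), about 6.53 for n = 256. The sketch below is a Monte Carlo check of that bound; how the reviewed study performed its calculation is not stated in the source, and the function name and sample size are assumptions.

```python
import math
import random

def mean_distance(dimensions=256, samples=2000, seed=0):
    """Monte Carlo estimate of the mean Euclidean distance between two
    independent uniform random points in a unit hypercube."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    total = 0.0
    for _ in range(samples):
        squared = sum((rng.random() - rng.random()) ** 2
                      for _ in range(dimensions))
        total += math.sqrt(squared)
    return total / samples

estimate = mean_distance()
upper_bound = math.sqrt(256 / 6)  # sqrt(n / 6), roughly 6.53
```

In high dimensions the distances concentrate tightly around this value, which is what makes it usable as a reference point for judging whether two videos are unusually close or distant.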
The distance and position of videos in the simulation were used to determine whether the videos are contextually related. If the videos are distant, they are contextually unrelated; if they are close together, they might be contextually related. Contextually inappropriate content is defined as a video that violates the assumptions, intentions, and goals of the viewer or the uploader of a specific video in the context of the current viewing session.
The recommendation network demonstrated a high degree of homogeneity of right-wing populist and politically neutral videos.

Yes
Auditing Radicalization Pathways on YouTube
The recommender system facilitated suggestions to Alt-lite and Intellectual Dark Web content. The study found pathways via the channel recommender system from Alt-lite and Intellectual Dark Web content to Alt-right content, but not via recommended videos.

Mixed Results
Down the (White) Rabbit Hole: The Extreme Right and Online Recommender Systems
More extreme right videos are accessible via the recommender system after watching an extreme right video. The authors introduce the concept of an 'ideological bubble' that forms after just a few clicks.

YOUTUBE & PROBLEMATIC CONTENT
Platformed racism: the mediation and circulation of an Australian race-based controversy on Twitter, Facebook and YouTube
The recommender system facilitated controversial humour about the Adam Goodes racism topic and videos by public figures who have shared racist remarks about Aboriginal people.

Yes
The implications of venturing down the rabbit hole
The study found communities of sexually suggestive channels, some of which contained indecent videos of children. 50% of these videos were reachable via the recommender system from the seed channels. However, most were ten jumps away. The videos were identified accidentally while analysing Brazilian political videos.